Scale Invariant Multi-length Motif Discovery
Authors
Abstract
Discovering approximately recurrent motifs (ARMs) in time series is an active area of research in data mining. Exact motif discovery was later defined as the problem of efficiently finding the most similar pairs of time series subsequences, and it can be used as a basis for discovering ARMs. The most efficient algorithm for solving this problem is the MK algorithm, which was designed to find the single pair of time series subsequences with maximum similarity at a known length. Available exact solutions to the problem of finding the top K similar subsequence pairs at multiple lengths (which can be the basis of ARM discovery) are not scale invariant. This paper proposes a new algorithm that solves this problem efficiently using scale-invariant distance functions and applies it to both real and synthetic datasets.
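To make the problem statement concrete, the following is a minimal brute-force sketch of exact motif discovery at a single known length. It is neither the MK algorithm nor the algorithm proposed in the paper. It assumes z-normalized Euclidean distance as the scale-invariant distance, which is one common choice and is not confirmed by the abstract. The function names `znorm` and `top_motif_pair` are hypothetical.

```python
import math

def znorm(x):
    """Z-normalize a subsequence; constant subsequences map to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(v - mean) / std for v in x]

def top_motif_pair(series, length):
    """Brute-force search for the closest pair of non-overlapping subsequences.

    Returns (distance, i, j), where i < j are the start indices.
    Assumption: distance is Euclidean distance between z-normalized windows,
    which ignores differences in offset and amplitude.
    """
    subs = [znorm(series[i:i + length])
            for i in range(len(series) - length + 1)]
    best = (math.inf, None, None)
    for i in range(len(subs)):
        # Start j at i + length to skip overlapping (trivial) matches.
        for j in range(i + length, len(subs)):
            d = math.dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# The window at index 8 is the window at index 0 scaled by 20 and shifted by 10.
series = [0, 1, 0, -1, 3, 7, 2, 9, 10, 30, 10, -10]
d, i, j = top_motif_pair(series, 4)
```

This baseline compares every pair of windows, so its cost grows quadratically with the series length. Repeating it for each length in a range gives a naive multi-length baseline, which is the kind of cost that exact algorithms for this problem aim to reduce.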
Similar resources
Efficient Discovery of Variable-length Time Series Motifs with Large Length Range in Million Scale Time Series
Detecting repeated variable-length patterns, also called variable-length motifs, has received a great amount of attention in recent years. The current state-of-the-art algorithm uses a fixed-length motif discovery algorithm as a subroutine to enumerate variable-length motifs. As a result, it may take hours or days to execute when the enumeration range is large. In this work, we introduce an approxima...
Efficient Discovery of Time Series Motifs with Large Length Range in Million Scale Time Series
Detecting repeated variable-length patterns, also called variable-length motifs, has received a great amount of attention in recent years. The current state-of-the-art algorithm uses a fixed-length motif discovery algorithm as a subroutine to enumerate variable-length motifs. As a result, it may take hours or days to execute when the enumeration range is large. In this work, we introduce an approxima...
Toward Unsupervised Activity Discovery Using Multi-Dimensional Motif Detection in Time Series
This paper addresses the problem of activity and event discovery in multi-dimensional time series data by proposing a novel method for locating multi-dimensional motifs in time series. While recent work has been done in finding single-dimensional and multi-dimensional motifs in time series, we address motifs in the general case, where the elements of multi-dimensional motifs have temporal, length, ...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
G-SteX: Greedy Stem Extension for Free-Length Constrained Motif Discovery
Most available motif discovery algorithms in real-valued time series find approximately recurring patterns of a known length without any prior information about their locations or shapes. In this paper, a new motif discovery algorithm is proposed that has the advantage of requiring no upper limit on the motif length. The proposed algorithm can discover multiple motifs of multiple lengths at onc...